Determining if two documents are written by the same author

Authors

  • Moshe Koppel
  • Yaron Winter
Abstract

Almost any conceivable authorship attribution problem can be reduced to one fundamental problem: whether a pair of (possibly short) documents were written by the same author. In this article, we offer an (almost) unsupervised method for solving this problem with surprisingly high accuracy. The main idea is to use repeated feature subsampling methods to determine if one document of the pair allows us to select the other from among a background set of “impostors” in a sufficiently robust manner.
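The main idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the features, the similarity measure, the number of iterations, or the decision threshold, so character 4-gram counts, cosine similarity, and the parameters `iterations`, `fraction`, and `threshold` are all hypothetical choices made here for concreteness.

```python
import math
import random
from collections import Counter


def char_ngrams(text, n=4):
    """Count overlapping character n-grams (an illustrative feature choice)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b, features):
    """Cosine similarity of two count vectors restricted to `features`."""
    dot = sum(a[f] * b[f] for f in features)
    norm_a = math.sqrt(sum(a[f] ** 2 for f in features))
    norm_b = math.sqrt(sum(b[f] ** 2 for f in features))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def impostor_score(doc_x, doc_y, impostors, iterations=100, fraction=0.5, seed=0):
    """Fraction of random feature subsets on which doc_y is strictly more
    similar to doc_x than every impostor is."""
    rng = random.Random(seed)
    x, y = char_ngrams(doc_x), char_ngrams(doc_y)
    imps = [char_ngrams(d) for d in impostors]
    vocab = sorted(set(x) | set(y) | set().union(*imps))
    if not vocab:
        return 0.0
    k = max(1, int(fraction * len(vocab)))
    wins = 0
    for _ in range(iterations):
        subset = rng.sample(vocab, k)  # repeated feature subsampling
        sim_y = cosine(x, y, subset)
        if all(sim_y > cosine(x, imp, subset) for imp in imps):
            wins += 1
    return wins / iterations


def same_author(doc_x, doc_y, impostors, threshold=0.8):
    """Hypothetical decision rule: accept if doc_y wins robustly enough."""
    return impostor_score(doc_x, doc_y, impostors) >= threshold
```

A pair is judged to share an author only when the second document is picked out from among the impostors on a large share of the random feature subsets, which is one way to read "in a sufficiently robust manner". In practice the threshold would need to be tuned on held-out data.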

Similar resources

Determining if Two Documents are by the Same Author

Almost any conceivable authorship attribution problem is reducible to one fundamental problem: was a pair of (possibly short) documents written by the same author. In this paper, we offer an (almost) unsupervised method for solving this problem with surprisingly high accuracy. The main idea is to use repeated feature sub-sampling methods to determine if one document of the pair allows us to sel...

Two Documents Related to the Architecture of Estarabad Village Baths

Written sources on Iranian pre-modern architecture in the Islamic Middle Ages are very scarce. The material that is scattered in literary and historic texts is normally limited to monumental architecture, and mostly related to the architectural works and not to the process and the agents. Moreover, there is not much material on people's and vernacular architecture. However, there are surviving doc...

Concept and conditions of alternative performance of contractual obligations in Iranian and modern international documents

The approach of modern international documents such as DCFR, PELC, OHADA, UPICC, CISG in fulfilling the obligations is to increase the probability and chance of fulfilling the obligation by the obligor by accepting multiple obligations and the possibility of choosing one obligation by the obligee or the obligee to fulfill the obligation. In a function called the alternative performance of contr...

Particle Swarm Model Selection for Authorship Verification

Authorship verification is the task of determining whether documents were or were not written by a certain author. The problem has been faced by using binary classifiers, one per author, that make individual yes/no decisions about the authorship condition of documents. Traditionally, the same learning algorithm is used when building the classifiers of the considered authors. However, the indivi...

Journal:
  • JASIST

Volume 65

Publication date: 2014